Persistent Snapshot Methods

ABSTRACT

A persistent snapshot is taken and maintained in accordance with a novel method and system for extended periods of time using only a portion of a computer readable medium of which the snapshot is taken. Multiple snapshots can be taken in succession at periodic intervals and maintained practically indefinitely. The snapshots are maintained even after powering down and rebooting of the computer system. The state of the object of the snapshot for each snapshot preferably is accessible via a folder on the volume of the snapshot. A restore of a file or folder may be accomplished by merely copying that file or folder from the snapshot folder to a current directory of the volume. Alternatively, the entire computer system may be restored to a previous snapshot state thereof. Snapshots that occurred after the state to which the computer is restored are not lost in the restore operation. Different rule sets and scenarios can be applied to each snapshot. Furthermore, each snapshot can be written to within the context of the snapshot and later restored to its pristine condition. Software for implementing the systems and methods of snapshots in accordance with the present invention may comprise firmware of a hard disk drive controller, whether on a disk controller board or within the HDD casing itself. The present invention further comprises novel systems and methods in which the systems and methods of taking and maintaining snapshots are utilized in creating and managing temporal data stores, including temporal database management systems. The implications for data mining and exploration, data analysis, intelligence gathering, and artificial intelligence (just to name a few areas) are profound.

CROSS REFERENCE TO RELATED APPLICATIONS

This is a continuation-in-part patent application that claims the benefit under 35 U.S.C. §120 to the filing dates of: U.S. nonprovisional patent application Ser. No. 10/248,483, titled, “Persistent Snapshot Management System,” filed Jan. 22, 2003, which is a nonprovisional of U.S. provisional patent application Ser. No. 60/350,434, titled, “Persistent Snapshot Management System,” filed Jan. 22, 2002; and U.S. nonprovisional patent application Ser. No. 10/349,474, titled, “Persistent Snapshot Management System,” filed Jan. 22, 2003, which is a nonprovisional of U.S. provisional patent application Ser. No. 60/350,434, titled, “Persistent Snapshot Management System,” filed Jan. 22, 2002. Each of these U.S. patent applications is hereby incorporated herein by reference.

APPENDIX DATA

Program Source Code Code.txt includes 83598 lines of code representing an implementation of a preferred embodiment of the present invention. The programming language is C++ and is intended to run on the Windows 2000 operating system. This program source code is incorporated herein by reference as part of the disclosure.

BACKGROUND OF INVENTION

Data of a computer system generally is archived on a periodic basis, such as at the end of each day; at the end of each week; at the end of each month; and/or at the end of each year. Data may also be archived before or after certain events or actions. When archived, the data is logically consistent, i.e., all of the data subjected to the archiving process at any point in time is maintained in the state as it existed at that particular point in time.

The archived data provides a means for restoring a computer system to a previous, known state, which may be necessary when performing disaster recovery such as occurs when data in a primary storage system is lost or corrupted. Data may be lost or corrupted if the primary storage system, such as a hard disk drive or other mass storage system, is physically damaged, if the operating system of the primary storage system crashes, or if files of the primary storage system are infected by a computer virus. By archiving the data on a periodic basis, the computer system always can be restored to its state as it existed at the most recent backup time, thereby minimizing any permanent data loss should disaster recovery actually be performed. The restoration may be of one or more files of the computer system or of the entire computer system itself.

There are numerous types of methods for archiving data. One type includes the copying of the data subject to the archive to a backup storage system. Typically, the backup storage system includes backup medium comprising magnetic computer tapes or optical disks used to store backup copies of large amounts of data, as is often associated with computer systems. Furthermore, each backup tape or optical disk can be maintained in storage indefinitely by sending it offsite. In order to minimize costs, such tapes and disks also can be reused on a rolling basis if such backup medium is rewriteable, or destroyed if not rewriteable and physical storage space for the backups is limited. In this latter scenario, the “first in-first out” methodology is utilized in which the tape or disk having the oldest recording date is destroyed first.

One disadvantage to archiving data by making backups is that the data subject to the archiving process is copied in totality onto the backup medium. Thus, if 250 gigabytes of data is to be archived, then 250 gigabytes of storage capacity is required. If a terabyte of data is to be backed up, then a terabyte of storage capacity is required. Another related disadvantage is that as the amount of data to be archived increases, the period of time required to perform the backup increases as well. Indeed, it may take weeks to archive onto tape a terabyte of data. Likewise, it may take weeks if it becomes necessary to restore such amount of data.

Yet another disadvantage is that sometimes an “incremental” backup is made, wherein only the new data that has been written since the last backup is actually copied to the backup medium. This is in contrast to the “complete” backup of the data, wherein all the data subject to the archiving process is copied whether or not it is new. Restoring archived data from complete and incremental backups requires copying from a complete backup and then copying from the incremental backups thereafter made between the time point of the complete backup and the time point of the restoration. A fourth and obvious disadvantage is that when the backup medium in the archiving process is stored offline, the archived data must be physically retrieved and mounted for access and, thus, is not readily available on demand.

In view of the foregoing, it will be apparent that it is extremely inefficient to utilize backups for restoring data when, for example, only a particular user file or some other limited subset of the backup is required. To address this concern, a snapshot can be taken of data whereby an image of the data at the particular snapshot moment can later be accessed. The object of the snapshot for which the image is provided may be a file, a group of files, a volume or logical partition, or an entire storage system. The snapshot may also be of a computer-readable medium, or portion thereof, and the snapshot may be implemented at the file level or at the storage system block level. In either case, the data of the snapshot is maintained for later access by (1) saving snapshot data before replacement thereof by new data in a “copy on write” operation, and (2) keeping track of all the snapshot data, including the snapshot data still residing in the original location at the snapshot moment as well as the snapshot data that has been saved elsewhere in the copy-on-write operation. Typically, the snapshot data that is saved in the copy-on-write operation is stored in a specially allocated area on the same storage medium as the object of the snapshot. This area typically is a finite storage area of fixed capacity.
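For illustration only, the copy-on-write rule described above can be reduced to a few lines of C++. The names and in-memory structures below are hypothetical and are not taken from the incorporated source code; actual implementations operate at the storage system block level rather than on in-memory maps.

    #include <cstdint>
    #include <map>
    #include <string>

    // Hypothetical in-memory model: a volume and the save area of one snapshot.
    using Granule = std::string;   // contents of one granule of the volume
    using Volume  = std::map<std::uint64_t, Granule>;

    struct SnapshotSaveArea {
        std::map<std::uint64_t, Granule> saved;  // address -> original snapshot data
    };

    // Copy-on-write: the first time a granule belonging to the snapshot image is
    // about to be overwritten, its original contents are saved; only then is the
    // new data written in place.
    void writeWithCopyOnWrite(Volume& volume, SnapshotSaveArea& snap,
                              std::uint64_t address, const Granule& newData)
    {
        if (snap.saved.find(address) == snap.saved.end())
            snap.saved[address] = volume[address];  // save the original exactly once
        volume[address] = newData;                  // then replace it on the volume
    }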

The use of snapshots has advantages over the archiving process because a backup medium separate and apart from a primary storage medium is not required, and the snapshot data is stored online and, thus, readily accessible. A snapshot also only requires storage capacity equal to that amount of data that is subjected to the copy-on-write operation; thus, all of the snapshot data need not be saved to a specifically allocated data storage area if all of the snapshot data is not to be replaced. The taking of a snapshot also is near instantaneous.

Advantageously, a snapshot may also be utilized in creating a backup copy of a primary storage medium onto a backup medium, such as a tape. As disclosed, for example, in Ohran U.S. Pat. No. 5,649,152, a snapshot can be taken of a base “volume” (a/k/a a “logical drive”), and then a tape backup can be made by reading from and copying the snapshot onto tape. During this archive process, reads and writes to the base volume can continue without waiting for completion of the archive process because the snapshot itself is a non-changing image of the data of the base volume as it existed at the snapshot moment. The snapshot in this instance thus provides a means by which data can continue to be read from and written to the primary storage medium while the backup process concurrently runs. Once the backup is created, the snapshot is released and the resources that were used for taking and maintaining the snapshot are made available for other uses by the computer system.

A disadvantage to utilizing snapshots is that a snapshot is not a physical duplication of the data of the object of the snapshot onto a backup medium. A snapshot is not a backup. Furthermore, if the storage medium on which the original object of the snapshot resides is physically damaged, then both the object and the snapshot can be lost. A snapshot, therefore, does not provide protection against physical damage of the storage medium itself.

A snapshot also requires significant storage capacity if it is to be maintained over an extended period of time, since snapshot data is saved before being replaced and, over the course of an extended period of time, much of the snapshot data may need saving. The storage capacity required to maintain the snapshot also dramatically increases as multiple snapshots are taken and maintained. Each snapshot may require the saving of overlapping snapshot data, which accelerates consumption of the storage capacity allocated for snapshot data. In an extreme case, each snapshot ultimately will require a storage capacity equal to the amount of data of its respective object. This is problematic as the storage capacity of any particular storage medium is finite and, generally, the finite data storage will not have sufficient capacity to accommodate this, leading to failure of the snapshot system.

Accordingly, snapshots generally are used solely for transient applications, wherein, after the intended purpose for which the snapshot is taken has been achieved, the snapshot is released and system resources freed, perhaps for the provision of a subsequent snapshot. Furthermore, because snapshots are only needed for temporary purposes, the means for tracking the snapshot data may be stored in RAM memory of a computer and is lost upon the powering down or loss of power of the computer, and, consequently, the snapshot is lost. In contrast thereto, backups are used for permanent data archiving.

Accordingly, a need exists for an improved system and method that, but for protection against physical damage to the storage medium itself, provides the combined benefits of both snapshots and backups without the time and storage capacity constraints associated with snapshots and backups. One or more embodiments of the present invention meet this and other needs, as will become apparent from the detailed description thereof below and consideration of the computer source code incorporated herein by reference and disclosed in the incorporated provisional U.S. patent application.

SUMMARY OF INVENTION

Briefly described, the invention comprises a snapshot management system.

BRIEF DESCRIPTION OF DRAWINGS

Further features and benefits of the present invention will be apparent from a detailed description of preferred embodiments thereof taken in conjunction with the following drawings, wherein similar elements are referred to with similar reference numbers, and wherein,

FIG. 1 is an overview of an exemplary operating environment for use with preferred embodiments of the present invention;

FIG. 2 is an overview of a preferred system of the present invention;

FIG. 3 is a graphical illustration of a first series of exemplary disk-level operations performed by a preferred snapshot system of the present invention;

FIG. 4 is a graphical illustration of a series of exemplary disk-level operations performed by a prior art snapshot system;

FIG. 5 is a flowchart showing a method performed by a preferred embodiment of the present invention implementing the operations of FIG. 3;

FIGS. 6a and 6b are graphical illustrations of a second series of exemplary disk-level operations performed by a preferred snapshot system of the present invention;

FIG. 7 is a graphical illustration of a third series of exemplary disk-level operations performed by a preferred snapshot system of the present invention;

FIG. 8 is a state diagram showing a preferred embodiment of the present invention implementing the operations of FIG. 7;

FIG. 9 is a flowchart showing a method performed by a preferred embodiment of the present invention implementing the operations of FIG. 7;

FIGS. 10a and 10b are graphical illustrations of a fourth series of exemplary disk-level operations performed by a preferred snapshot system of the present invention;

FIG. 11 is a flowchart illustrating a preferred secure copy-on-write method as used by preferred embodiments of the present invention;

FIGS. 12-32 illustrate user screen shots of a preferred implementation of the methods and systems of the present invention;

FIG. 33 is a graphical illustration of a series of exemplary disk-level operations performed by a preferred snapshot system of the present invention;

FIG. 34 is a diagram showing associations of various aspects of a preferred system of the present invention;

FIG. 35 is a diagram showing information contained in various components of a preferred system of the present invention;

FIG. 36 is a flowchart showing a method performed by a preferred embodiment of the present invention;

FIG. 37 is a screen shot of an exemplary user interface for use by a preferred embodiment of the present invention;

FIG. 38 is a screen shot of another exemplary user interface for use by a preferred embodiment of the present invention;

FIG. 39 is a screen shot of another exemplary user interface for use by a preferred embodiment of the present invention;

FIG. 40 is a screen shot of a folder tree as used by a preferred embodiment of the present invention;

FIG. 41 is a screen shot of another folder tree as used by a preferred embodiment of the present invention;

FIG. 42 is a screen shot of yet another folder tree as used by a preferred embodiment of the present invention;

FIG. 43 is a firmware implementation of a preferred embodiment of the present invention;

FIG. 44 is another firmware implementation of a preferred embodiment of the present invention; and

FIG. 45 is yet another firmware implementation of a preferred embodiment of the present invention.

DETAILED DESCRIPTION

As a preliminary matter, it will readily be understood by those persons skilled in the art that the present invention is susceptible of broad utility and application in view of the following detailed description of preferred embodiments of the present invention. Many devices, methods, embodiments, and adaptations of the present invention other than those herein described, as well as many variations, modifications, and equivalent arrangements thereof, will be apparent from or reasonably suggested by the present invention and the following detailed description thereof, without departing from the substance or scope of the present invention. Accordingly, while the present invention is described herein in detail in relation to preferred embodiments, it is to be understood that this disclosure is illustrative and exemplary and is made merely for purposes of providing a full and enabling disclosure of preferred embodiments of the invention. The disclosure herein is not intended nor is to be construed to limit the present invention or otherwise to exclude any such other embodiments, adaptations, variations, modifications and equivalent arrangements, the present invention being limited only by the claims appended hereto or presented in any continuing application, and the equivalents thereof.

Exemplary Operating Environment

FIG. 1 and the following discussion are intended to provide a brief, general description of a suitable computing environment in which the present invention may be implemented. While the invention will be described in the general context of an application program that runs on an operating system in conjunction with a server or personal computer, those skilled in the art will recognize that the invention also may be implemented in combination with other program modules. Generally, program modules include routines, programs, components, data structures, and the like that perform particular tasks or implement particular abstract data types. Moreover, those skilled in the art will appreciate that the invention may be practiced with other computer system configurations, including hand held devices, multiprocessor systems, microprocessor based or programmable consumer electronics, minicomputers, mainframe computers, and the like. The present invention may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote memory storage devices.

With reference to FIG. 1, an exemplary system for implementing the invention includes a conventional personal or server computer 20, including a processing unit 21, a system memory 22, and a system bus 23 that couples the system memory to the processing unit 21. The system memory 22 includes read only memory (ROM) 24 and random access memory (RAM) 25. A basic input/output system 26 (BIOS), containing the basic routines that help to transfer information between elements within the computer 20, such as during startup, is stored in ROM 24. The computer 20 further includes a hard disk drive 27, a magnetic disk drive 28, e.g., to read from or write to a removable disk 29, and an optical disk drive 30, e.g., for reading a CDR disk 31 or to read from or write to other optical media. The hard disk drive 27, magnetic disk drive 28, and optical disk drive 30 are connected to the system bus 23 by a hard disk drive interface 32, a magnetic disk drive interface 33, and an optical drive interface 34, respectively. The drives and their associated computer readable media provide nonvolatile storage for the computer 20. Although the description of computer readable media above refers to a hard disk, a removable magnetic disk, and a CDR disk, it should be appreciated by those skilled in the art that other types of media which are readable by a computer, such as magnetic cassettes, flash memory cards, digital video disks (DVDs), Bernoulli cartridges, and the like, may also be used in the exemplary operating environment.

A number of program modules may be stored in the drives and RAM 25, including an operating system 35, one or more application programs 36, the Persistent Storage Manager (PSM) module 37, and program data 38. A user may enter commands and information into the computer 20 through a keyboard 40 and pointing device, such as a mouse 42. Other input devices (not shown) may include a microphone, joystick, game pad, satellite dish, scanner, or the like. These and other input devices are often connected to the processing unit 21 through a serial port interface 46 that is coupled to the system bus 23, but may be connected by other interfaces, such as a game port or a universal serial bus (USB). A monitor 47 or other type of display device is also connected to the system bus 23 via an interface, such as a video adapter 48. In addition to the monitor 47, computers typically include other peripheral output devices (not shown), such as speakers or printers.

The computer 20 may operate in a networked environment using logical connections to one or more remote computers, such as a remote computer 49. The remote computer 49 may be a server, a router, a peer device, or other common network node, and typically includes many or all of the elements described relative to the computer 20, although only a memory storage device 50 has been illustrated in FIG. 1. The logical connections depicted in FIG. 1 include a local area network (LAN) 51 and a wide area network (WAN) 52. Such networking environments are commonplace in offices, enterprise wide computer networks, intranets and the Internet.

When used in a LAN networking environment, the computer 20 is connected to the LAN 51 through a network interface 53. When used in a WAN networking environment, the computer 20 typically includes a modem 54 or other means for establishing communications over the WAN 52, such as the Internet. The modem 54, which may be internal or external, is connected to the system bus 23 via the serial port interface 46. In a networked environment, program modules depicted relative to the computer 20, or portions thereof, may be stored in the remote memory storage device. It will be appreciated that the network connections shown are exemplary and other means of establishing a communications link between the computers may be used.

Exemplary Snapshot System

Turning now to FIG. 2, an exemplary snapshot system of the present invention is illustrated. The purpose of a snapshot system is to maintain the saved “current state” of memory of a computer system, or some portion thereof. Typically, a snapshot is periodically “taken” so that a computer system can be restored in the event of failure. At the file level, snapshots enable previous versions of files to be brought back for review or to be placed back into use should that become necessary. As will be seen herein, the snapshot system of the present invention provides the above capabilities, and much more.

Such system includes components of a computer system, such as an operating system 210. The system also includes a persistent storage manager (PSM) module 220, which performs methods and processes of the present invention, as will be explained hereinafter. The system also includes at least one finite data storage medium 230, such as a hard drive or hard disk. The storage medium 230 comprises two dedicated portions, namely, a primary volume 242 and a cache 244. The primary volume 242 contains active user and system data 235. The cache 244 contains a plurality of snapshot caches 252, 254, 256 generated by the PSM module 220.

The operating system 210 includes system drivers 212 and a plurality of mounts 214, 216, 218. The system also includes a user interface 270, such as a monitor or display. The user interface 270 displays snapshot data 272 in a manner that is meaningful to the user, such as by means of conventional folders 274, 276, 278. Each folder 274, 276, 278 is generated from a respective mount 214, 216, 218 by the operating system 210. Each respective folder preferably displays snapshot information in a folder and file tree format 280, as generated by the PSM module 220. Specifically, as will be discussed in greater detail herein, the PSM module 220 in conjunction with the operating system 210 is able to display current and historical snapshot information by accessing both active user and system data 235 and snapshot caches 252, 254, 256 maintained on the finite data storage medium 230.
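As a purely illustrative sketch, and not a reproduction of the incorporated C++ source, the relationship among the primary volume 242, the cache 244, and the snapshot caches 252, 254, 256 managed by the PSM module 220 might be modeled with structures such as the following (all names are hypothetical):

    #include <cstdint>
    #include <ctime>
    #include <map>
    #include <string>
    #include <vector>

    using Granule = std::string;  // contents of one granule ("A", "E", ...)

    // One snapshot-specific cache: holds only granules that were overwritten
    // between this snapshot moment and the next snapshot moment.
    struct SnapshotCache {
        std::time_t snapshotMoment;
        std::map<std::uint64_t, Granule> savedGranules;  // sparse by design
    };

    // Hypothetical persistent storage manager, analogous in role to the PSM
    // module 220: it owns the active volume image and all snapshot caches.
    class PersistentStorageManager {
    public:
        std::map<std::uint64_t, Granule> primaryVolume;  // active user and system data 235
        std::vector<SnapshotCache> snapshotCaches;       // caches 252, 254, 256, ...
    };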

Methods and further processes for taking, maintaining, managing, manipulating, and displaying snapshot data according to the present invention will be described in greater detail hereinafter.

Exemplary Disk Level Operations

Referring generally to FIGS. 3 and 5 through 11, a series of exemplary disk level operations performed by a preferred snapshot system of the present invention and associated methods of the same are illustrated. Turning first to FIG. 3, a first set of operations 300 (“Write to Volume”) is shown in which “write” commands to a volume occur and the resulting impact on the snapshot caches is discussed.

FIG. 3 is divided generally into five separate but related sections. The first section 310 illustrates a timeline or time axis beginning on the left side of the illustration and extending to the right into infinity. The timeline shows only the first twenty-two (22) discrete chronological time points along this exemplary timeline. It should be noted that the actual time interval between each discrete chronological time point and within each time point may be of any arbitrary duration. The sum of the duration of the chronological time points and of any intervening time intervals defines the exemplary time duration of the timeline depicted by FIG. 3. As will be discussed in greater detail hereinafter, three snapshots are taken between time 1 and time 22, namely, at times 5, 11, and 18.

The second section 320 of FIG. 3 graphically illustrates a series of commands to “write” new data to a volume of a finite data storage medium, such as a hard disk. The row numbers 1 through 4 of this grid identify addresses of the volume to which data will be written. Further, each column of this grid corresponds with a respective time point (directly above) from the timeline of section 310. It should be understood that a volume generally contains many more than the four addresses (rows) shown herein; however, only the first four address locations are necessary to describe the functionality of the present invention hereinafter. The letters (E, F, G, H, I, J, K, and L), shown within this grid, represent specific data for which a command to write such specific data to the volume at the corresponding address and at a specific time point has been received. For example, as shown in this section 320, a command has been received by the system to write data “E” to address 2 at time 3, to write data “F” to address 3 at time 7, and so on.

The third section 330 of FIG. 3 is also illustrated as a grid, which identifies the data values actually stored in the volume at any particular point in time. Each grid location identifies a particular volume granule at a point in time. Again, the row numbers 1 through 4 of the grid identify volume addresses and each column corresponds with a respective time point (directly above) from the timeline of section 310. For example, the data values stored in the volume at addresses 1 through 4 at time 13 are “AEFG,” the data value stored in the volume at address 3 at time 21 is “J,” the data values stored in the volume at addresses 2 through 3 at time 4 are “EC,” and so on. Finally, column 335 identifies the data stored in the volume as of time 22. Upper case letters are used herein to identify data that has value, namely, data that has not been deleted or designated for deletion. In addition, the first time data is added to the volume, it is shown in bold.

The fourth section 340 of FIG. 3 graphically illustrates each snapshot specific cache created in accordance with the methods of the present invention. For illustrative purposes, only the three snapshot specific caches corresponding to the first, second, and third snapshots taken at times 5, 11, and 18, respectively, are shown. Each snapshot specific cache is illustrated in two different manners.

First, like sections 320 and 330, each snapshot specific cache 342, 344, 346 is illustrated as a grid, with rows 1 through 4 corresponding to volume address locations 1 through 4 and with each column corresponding to a respective point in time from the timeline in section 310. Each grid shows how each respective snapshot specific cache is populated over time. Specifically, it should be understood that a snapshot specific cache comprises potential granules corresponding to each row of address locations of the volume but only for points of time beginning when the respective snapshot is taken and ending with the last point of time just prior to the next succeeding snapshot. There is no overlap in points of time between any two snapshot specific caches.

Thus, each snapshot specific cache grid 342, 344, 346 identifies what data has been recorded to that respective cache and when such data was actually recorded. For example, as shown in the first snapshot specific cache grid 342, data “C” is written to address 3 at time 8 and is maintained in that address for this first cache thereinafter. Likewise, data “D” is written to address 4 at time 9 and maintained at that address for this first cache thereinafter. Correspondingly, in the second snapshot specific cache 344, data “G” is written to address 4 at time 14 and maintained at that address for this second cache thereinafter. In the third snapshot specific cache 346, data “A” is written to address 1 at time 21 and maintained at that address for this third cache thereinafter, data “F” is written to address 3 at time 20 and maintained at that address for this third cache thereinafter, and data “I” is written to address 4 at time 20 and maintained at that address for this third cache thereinafter. The shaded granules in each of the snapshot specific cache grids 342, 344, 346 merely indicate that no data was written to that particular address at that particular point in time in that particular snapshot specific cache; thus, no additional memory of the data storage medium is used or necessary.

The second manner of illustrating each snapshot specific cache is shown by column 350, which includes the first snapshot specific cache 352, the second snapshot specific cache 354, and the third snapshot specific cache 356. As explained previously, each snapshot specific cache only comprises potential granules corresponding to each row of address locations of the volume for points of time beginning when the respective snapshot is taken and ending with the last point of time just prior to the next succeeding snapshot. In other words, the first snapshot cache was being dynamically created between times 5 and 10 and actually changed from time 8 to time 9; however, at time 11, when the second snapshot was taken, the first snapshot cache became permanently fixed, as shown by cache 352. Likewise, the second snapshot cache was being dynamically created between times 11 and 17 and actually changed from time 13 to time 14; however, at time 18, when the third snapshot was taken, the second snapshot cache became permanently fixed, as shown by cache 354. Finally, the third snapshot cache is still in the process of being dynamically created beginning at time 18, and changed from time 19 to time 20 and from time 20 to time 21; however, this cache 356 will not actually become fixed until a fourth snapshot (not shown) is taken at some point in the future. Thus, even though cache 356 has not yet become fixed, it can still be accessed and, as of time 22, contains the data as shown.

Further, it should be understood that the shaded granules in each of the snapshot specific caches 352, 354, 356 merely indicate that no data was written or has yet been written to that particular address when that particular cache was permanently fixed in time (for caches 352, 354) or as of time 22 (for cache 356); thus, no additional memory of the data storage medium has been used or was necessary to create the caches 352, 354, 356. Stated another way, only the data shown in the fifth section of FIG. 3, table 360, is necessary to identify the first three snapshot caches 352, 354, 356 as of time 22.

Although it should be self evident from FIG. 3 how data is written to the volume and the impact such writes have on the cache in light of when snapshots are taken, it will nevertheless be helpful to examine the impacts of each write command shown in section 320 on the system on a time point by time point basis. First, before proceeding with such analysis, it should be understood or observed that no write commands are shown at a time point in which a snapshot is taken. This is intentional. In the preferred embodiment of the present invention, to maintain the integrity of the data on the volume and stored in the cache, whenever a write command is received by the system, the next snapshot is delayed until such write command has been performed and completed.

Now, proceeding with the time point by time point analysis of FIG. 3, at time 1, the data values stored in addresses 1 through 4 of the volume are previously set to “ABCD.” The status of the system does not change at time 2.

However, at time 3, a command to write data “E” to address 2 is received. Data “E” is written to this address at time 4, replacing data “B.” Data “B” is not written to any snapshot cache because no snapshots have yet been taken of the volume. Thus, at time 5, when the first snapshot is taken, the values of the volume are “AECD.” It should be noted that although the snapshot has been taken at time 5, there is no need, yet, to record any of the data in the volume to snapshot cache because the current volume accurately reflects what the state of the volume is or was at time 5. Since the volume is still the same as it was at time 5, nothing changes at time 6.

At time 7, a command to write data “F” to address 3 is received. Data “F” will be replacing data “C” on the volume; however, because data “C” is part of snapshot 1, data “F” is not immediately written to this address. First, data “C” must be written to the first snapshot cache, as shown at time 8 in cache grid 342. Once data “C” has been written to the first snapshot cache, data “F” can then be safely written to address 3 of the volume, which is shown at the next time point, time 9. This process is generally described as the “copy on write” process in conventional snapshot parlance. The copy on write process is repeated for writing data “G” to the volume and writing data “D” to the first snapshot cache, but it is staggered in time from the previous copy on write process.

The second snapshot is taken at time 11. The volume at that point is “AEFG.” Again, as stated previously, it is at this point that the first snapshot cache 342 is permanently fixed, as shown by granules 352. It is no longer necessary to add any further information to this first snapshot cache 352.

Continuing with FIG. 3, at time 13, a command to write data “H” to address 4 is received. Data “H” will be replacing data “G”; however, because data “G” is part of snapshot 2, data “H” is not immediately written to this address. The copy on write process is performed so that data “G” is written to the second snapshot cache at time 14 as shown in grid 344. Once data “G” has been written to the second snapshot cache, data “H” can be safely written to address 4 of the volume at time 15. At time 16, a command to write data “I” to address 4 is received. Importantly, it should be noted that data “I” immediately (at time 17) replaces data “H” in the volume and “H” is not written to the snapshot cache. The reason for this is because data “H” was not in the volume at the point in time at which any of the previous snapshots were taken. Because address 4 of the volume changed twice between snapshots, only the starting value of this address is captured by the snapshots. Intermediate data “H” is lost.

The third snapshot is taken at time 18. The volume at that point is now “AEFI.” Again, as stated previously, it is at this point that the second snapshot cache 344 is permanently fixed, as shown by granules 354. It is no longer necessary to add any further information to this second snapshot cache 354.

At time 19, commands to write data “J” to address 3 and data “K” to address 4 are received. Data “J” will be replacing data “F” and data “K” will be replacing data “I”; however, because data “F” is part of snapshots 1 and 2 and because data “I” was part of snapshot 2, data “J” and “K” are not immediately written to these addresses. The copy on write process is performed for each address so that data “F” and “I” are written to the third snapshot cache at time 20 as shown in grid 346. Once this has occurred, data “J” and “K” can be safely written to addresses 3 and 4, respectively, of the volume at time 21. These particular copy on write procedures are included so that one can easily see the different state of the cache for addresses 3 and 4 for each different snapshot cache 352, 354, 356. Specifically, it was not necessary to include data “F” as part of the second snapshot cache 354, even though it was on the volume at the time of the second snapshot.

Finally, at time 20 a command to write data “L” to address 1 is received. Data “L” will be replacing data “A”; however, because data “A” is part of snapshots 1, 2, and 3, data “L” is not immediately written to this address. The copy on write process is performed so that data “A” is written to the third snapshot cache at time 21 as shown in grid 346. Once data “A” has been written to the third snapshot cache, data “L” can be safely written to address 1 of the volume at time 22. This particular copy on write procedure is included herein to illustrate that, even though data “A” was part of snapshots 1 and 2, it did not need to be written to cache until it was actually replaced. Further, it is not necessary to copy data “A” to the first or second snapshot caches 352, 354; it only needs to be part of the third snapshot cache 356. Again, the third snapshot cache 356 will become fixed as soon as the next snapshot is taken.

Finally, it should be noted that data “E,” which is part of all three snapshots, is not written to cache because it is never replaced during the time duration of FIG. 3.

Turning briefly now to FIG. 4, a set of operations 400 performed by a prior art snapshot system, as implemented by Ohran U.S. Pat. No. 5,649,152, is illustrated. For ease of comparison, FIG. 4 is laid out in a similar format to that of FIG. 3. For example, sections 410, 420, and 430 of FIG. 4 correspond to sections 310, 320, and 330 of FIG. 3. Further, the state of the volume as of time 22, as shown by column 435 in FIG. 4, is the same as the state of the volume as of time 22, as shown by column 335 in FIG. 3. Contrasts between operation 300 of the present invention and operation 400 of FIG. 4 (Ohran) are most evident by comparing, respectively, sections 440, 450, and 460 of FIG. 4 with sections 340, 350, and 360 of FIG. 3.

Unlike the present invention, each snapshot cache 442, 444, and 446 begins at its respective time of snapshot (times 5, 11, and 18, respectively) but then continues ad infinitum, as long as the system is maintaining snapshots in memory, rather than stopping at the point in time just prior to the next snapshot being taken. The result of this is that the same data is recorded redundantly in each snapshot cache 452, 454, and 456. For example, data “A” is stored not only in the third snapshot cache 456 at address 1 but also at address 1 in the first and second snapshot caches 452, 454, respectively. Likewise, data “F” is stored not only in the third snapshot cache 456 at address 3 but also in the second snapshot cache 454, also at address 3. The redundancy of this prior art system is illustrated as well with reference to table 460, which may be contrasted easily with table 360 in FIG. 3. Although the amount of data that must be stored by the prior art system shown in table 460 of FIG. 4 does not appear to be substantially greater than that of table 360 in FIG. 3, it should be apparent to one skilled in the art that, with the passage of time, with changes to data stored on the volume, and as more and more snapshots of the volume are taken, the amount of memory required to store snapshots of the prior art system 400 and the amount of redundancy of data storage grows exponentially greater than that of the system 300 of the present invention.

Turning now to FIG. 5, a method 500 for performing the first series of operations 300 from FIG. 3 is illustrated. First, the system waits (Step 510) until a command is received from the system, from an administrator of the system, or from a user of the system. If a command to take a snapshot is received (Step 520), then a new snapshot cache is started (Step 530) and the previous snapshot cache, if one exists, is ended (Step 540). The process then returns to Step 510 to wait for another command.

If the determination in Step 520 is negative, then the system determines (Step 550) whether a command to write new data to the volume has been received. If not, then the system returns to Step 510 to wait for another command. If so, then the system determines (Step 560) whether the data on the volume that is going to be overwritten needs to be cached. For example, from FIG. 3, data “B” and “H” did not need to be cached. On the other hand, data “C,” “D,” “G,” “F,” “I,” and “A,” from FIG. 3, all needed to be cached. If the determination in Step 560 is positive, then the data to be overwritten on the volume is written (Step 570) to snapshot cache. If the determination in Step 560 is negative or after Step 570 has been performed, then the new data is written (Step 580) to the volume. The process then returns to Step 510 to wait for another command.
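A hypothetical C++ sketch of method 500 follows. It assumes a simplified in-memory model (names such as SnapshotWriter are illustrative and are not taken from the incorporated source) and caches old data only when that data was already on the volume at the most recent snapshot moment, which reproduces the behavior of FIG. 3, including the loss of intermediate data “H”.

    #include <cstdint>
    #include <map>
    #include <set>
    #include <string>
    #include <vector>

    using Granule = std::string;
    using Volume  = std::map<std::uint64_t, Granule>;
    using Cache   = std::map<std::uint64_t, Granule>;

    // Hypothetical sketch of method 500 (FIG. 5).
    class SnapshotWriter {
    public:
        explicit SnapshotWriter(Volume& v) : volume(v) {}

        // Steps 530/540: start a new snapshot cache; the previous cache is
        // implicitly "ended" because nothing more is ever added to it.
        void takeSnapshot() {
            caches.emplace_back();
            dirtySinceSnapshot.clear();  // everything now on the volume is part of the snapshot
        }

        // Steps 550-580: write new data, performing copy-on-write if needed.
        void write(std::uint64_t address, const Granule& newData) {
            const bool needsCaching =
                !caches.empty() &&                       // at least one snapshot exists
                dirtySinceSnapshot.count(address) == 0;  // old value was on the volume at the snapshot moment
            if (needsCaching)
                caches.back()[address] = volume[address];  // Step 570
            volume[address] = newData;                     // Step 580
            dirtySinceSnapshot.insert(address);
        }

        std::vector<Cache> caches;  // snapshot specific caches (352, 354, 356, ...)

    private:
        Volume& volume;
        std::set<std::uint64_t> dirtySinceSnapshot;
    };

Replaying the write commands of FIG. 3 through this sketch yields exactly the caches of table 360: “C” and “D” land in the first cache, “G” in the second, and “F,” “I,” and “A” in the third, while “B” and “H” are never cached.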

Turning now to FIGS. 6a and 6b, a second set of operations 600a, 600b, respectively, (“Read First and Second Snapshots”) are shown in which “read snapshot” commands are received and the system, by means of accessing the current volume and the relevant snapshot caches, is able to reconstruct what the volume looked like at an historical point in time at which the respective snapshot was taken. FIGS. 6a and 6b are divided generally into three separate but related sections 610, 630, 620.

Turning first to FIG. 6a, the first section 610 illustrates a timeline or time axis. This timeline 610 is the same as the timeline 310 previously discussed in FIG. 3. As will be recalled, the first snapshot from FIG. 3 was taken at time 5 and, for ease of reference, is shown again in FIG. 6a. The second section 630 of FIG. 6a graphically illustrates the volume, as it existed in the past, and the data stored therein at any particular point in time along timeline 610. Again, this historical volume grid 630 is identical to the volume grid 330 from FIG. 3. The third section 620 of FIG. 6a graphically illustrates the operations that are performed by the system to “read” the first snapshot (i.e., to correctly identify what data was contained in the volume when the first snapshot was taken).

Column 637 identifies what data was contained in the volume at time 5, when the first snapshot was taken; however, it is assumed that the system only has access to the data from the current volume 635, as it exists immediately after time 22, and to the snapshot caches 652, 654, and 656. Column 670 represents what the system would read as the image of the first snapshot. Thus, after the proper procedures are performed, column 670 should match column 637.

To determine the data on the volume at the first snapshot, it is first necessary to examine the first snapshot cache 652. Each separate address granule is examined and, if any granule has any data therein, such data is represented in column 670 and would be read by the system as part of the first snapshot. As shown, the first snapshot cache has data “C” at address 3 and data “D” at address 4. These are represented in column 670 at addresses 3 and 4, respectively.

Next, each address granule for which data has not yet been determined is considered. Thus, addresses 1 and 2 are considered, but addresses 3 and 4 are not considered because values have been determined for those addresses. Accordingly, the second snapshot cache 654 is then examined in an attempt to determine values for addresses 1 and 2. If either address has data found in the second snapshot cache 654, then such data is represented in column 670 at its respective address. As illustrated in FIG. 6a, no data exists in the second snapshot cache 654 for either of these two addresses.

This process is repeated for each successive snapshot cache until all successive snapshot caches have been considered or until no value for any address remains undetermined. As shown, addresses 1 and 2 of the third snapshot cache 656 are next examined, and data “A” from address 1 in the third snapshot cache 656 is found and thus represented in address 1 of column 670.

Once all snapshot caches have been examined, data for any address for which no value was found in such snapshot caches is obtained directly from the relevant address(es) of the current volume 635. In this case, data “E” from the current volume at address 2 is represented in column 670 as the value for address 2.

As shown, the data 637 as it existed in the volume at time 5 is correctly represented in column 670 by following the above process.

Likewise, turning to FIG. 6b, the ability to reconstruct the data 638 in the volume at time 11, when the second snapshot was taken, may be done in a similar manner to that described with reference to FIG. 6a. The primary difference between FIGS. 6a and 6b is that, to reconstruct the volume at the second snapshot, any prior snapshot caches are ignored. In this case, the first snapshot cache 652 is irrelevant to the process of constructing column 680. The process, thus, begins with the second snapshot cache 654 and proceeds in a similar manner to that described for FIG. 6a, but with a different outcome. In this manner, the data 638 in the volume at time 11 is correctly reconstructed in column 680.
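The read procedures of FIGS. 6a and 6b amount to a fixed lookup order: the cache of the snapshot being read, then each later snapshot cache in succession, and finally the current volume. A hedged C++ sketch of that lookup, using the same hypothetical in-memory types as the sketches above (not the incorporated driver code), is:

    #include <cstdint>
    #include <map>
    #include <string>
    #include <vector>

    using Granule = std::string;
    using Volume  = std::map<std::uint64_t, Granule>;
    using Cache   = std::map<std::uint64_t, Granule>;

    // Reconstruct the value a given address held at the moment snapshot
    // `snapshotIndex` was taken (0 = first snapshot). Caches taken before
    // that snapshot are ignored, exactly as in FIG. 6b.
    Granule readSnapshotGranule(const Volume& currentVolume,
                                const std::vector<Cache>& caches,
                                std::size_t snapshotIndex,
                                std::uint64_t address)
    {
        // Examine the snapshot's own cache, then each successive cache.
        for (std::size_t i = snapshotIndex; i < caches.size(); ++i) {
            auto it = caches[i].find(address);
            if (it != caches[i].end())
                return it->second;  // the earliest save of the original value wins
        }
        // Not saved anywhere: the value has never been overwritten since the
        // snapshot moment, so the current volume still holds it.
        return currentVolume.at(address);
    }

Applied to the data of FIG. 3, this lookup returns “AECD” for the first snapshot (column 670) and “AEFG” for the second (column 680).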

Turning now to FIG. 7, a third set of operations 700 (“Write/Delete to Volume”) is shown in which “write and/or delete” commands to a volume occur and the resulting impact on the snapshot caches is discussed. Like FIG. 3, FIG. 7 is divided generally into five separate but related sections. The first section 710 illustrates a timeline or time axis similar to the timeline 310 of FIG. 3; however, the timeline 710 shows only the first twenty (20) discrete chronological time points along this exemplary timeline. In contrast with previous FIGS., the three snapshots shown in FIG. 7 are taken at times 6, 11, and 15.

The second section 720 of FIG. 7 graphically illustrates a series of commands to “write” new data to a volume or to “delete” existing data from a volume. The letters (E, F, G, H, I, and J), shown within this grid, represent specific data for which a command to “write” such specific data to the volume at the corresponding address and at a specific time point has been received. In contrast, a command to delete data from the volume is illustrated by an address and time granule in this grid 720 with a slash mark or reverse hash symbol. For example, as shown in this section 720, a command has been received by the system to write data “E” to address 2 at time 2, to write data “F” to address 3 at time 2, to delete the value of data (whatever data that happens to be) on the volume at address 2 at time 4, and so on.

The third section 730 of FIG. 7 is also illustrated as a grid, which identifies the data values actually stored in the volume at any particular point in time. Upper case letters are used, as they were in FIG. 3, to identify active data on the volume that has value, namely, data that has not been deleted or designated for deletion and is currently “in use.” In addition, the first time any new data is added to the volume, it is shown in bold. In contrast, lower case letters residing on the volume represent memory space on the volume that is available for use. For example, volume addresses 1 through 4 at time 1 contain data “a” through “d,” respectively, each of which represents old and unwanted data, such as files or information previously subjected to delete commands. The prime symbols marking letters (for example, “H” at address 3 at time 6) represent granules of data, which were identified as being on the volume when a snapshot is taken but which have not yet been recorded to snapshot cache. The letters marked with a prime symbol, therefore, represent data that are “primed” for recording to a snapshot cache prior to any replacement (overwriting). As will be discussed hereinafter, both data in use (upper case letters) and data understood as deleted (lower case letters) can be primed for cache recording. Finally, column 735 identifies the data actually stored in the volume as of time 20.
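The notational conventions just described suggest four granule states. The following enumeration is only an illustrative naming of those states (it does not appear in the incorporated source) and also anticipates the state transitions illustrated for exemplary data “K” in FIG. 8:

    // Hypothetical labels for the granule states implied by FIG. 7's notation
    // (upper or lower case letters, with or without a prime symbol).
    enum class GranuleState {
        InUse,            // upper case: active data currently in use
        InUsePrimed,      // upper case, primed: in use and part of a snapshot image,
                          // so it must be copied to cache before being overwritten
        Available,        // lower case: deleted or unwanted; space may be reused
        AvailablePrimed   // lower case, primed: designated for deletion but still part
                          // of a snapshot image, so it too must be cached before reuse
    };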

The fourth section 740 of FIG. 7 graphically illustrates each snapshot specific cache created in accordance with the methods of the present invention. For illustrative purposes, only the three snapshot specific caches corresponding to the first, second, and third snapshots taken at times 6, 11, and 15, respectively, are shown. As was done in FIG. 3, each snapshot specific cache is illustrated in two different manners: as snapshot specific cache grids 742, 744, 746, which show how each snapshot cache changed over time, and, in column 750, which shows the current states of each such snapshot specific cache 752, 754, 756. It should be recalled that the first snapshot specific cache 752 became fixed as of the time of the second snapshot shown in this FIG. 7, namely, at time 11, and that the second snapshot specific cache 754 became fixed as of the time of the third snapshot shown in this FIG. 7, namely, at time 15. Finally, the third snapshot cache 756 is still in the process of being dynamically created as of time 20 and will not actually become fixed until a fourth snapshot (not shown) is taken at some point in the future. Thus, even though cache 756 has not yet become fixed, it can still be accessed and, as of time 20, contains the data as shown.

Further, it should be understood that the shaded granules in each of the snapshot specific caches 752, 754, 756 merely indicate that no data was written or has yet been written to that particular address when that particular cache was permanently fixed in time (for caches 752, 754) or as of time 20 (for cache 756); thus, no additional memory of the data storage medium has been used or was necessary to create the caches 752, 754, 756. Stated another way, only the data shown in the fifth section of FIG. 7, table 760, is necessary to identify the first three snapshot caches 752, 754, 756 as of time 20.

Although it should be self evident from FIG. 7 how data is written to or deleted from the volume and the impact such writes and deletes have on the cache in light of when snapshots are taken, it will nevertheless be helpful to examine the impacts of each write and delete command shown in section 720 on the system on a time point by time point basis.

Now, proceeding with the time point by time point analysis of FIG. 7, at time 1, the data values stored in addresses 1 through 4 of the volume are previously set to “abcd,” which are undesired data (because they are lower case).

At time 2, commands to write data “E” to address 2 and data “F” to address 3 are received. At time 3, a command to write data “G” to address 4 is received. Data “E” is written to address 2 at time 3, replacing data “b”; data “F” is written to address 3 also at time 3, replacing data “c”; and data “G” is written to address 4 at time 4, replacing data “d.” Data “b” and “c” and “d” are not written to any snapshot cache for either of two reasons: they are lower case, which means they are undesirable and do not need to be cached, and they have been overwritten prior to the first snapshot and thus do not get cached.

At time 4, a command to write data “H” to address 3 is received. Data “H” is written to address 3 at time 5, replacing data “F.” It should be noted that data “H” merely replaces data “F” in the volume; data “F” is not cached because no snapshot has yet been taken.

At time 4, a command to delete the data stored at address 2 is received. Thus, data “E” becomes data “e” at time 4 in the volume. Thus, at time 6, when the first snapshot is taken, the values of the volume are “aeHG.” Data “H” and “G” are now “primed,” as denoted by the prime symbol, to indicate that such data should be written to cache if they are ever overwritten by different data. As will become apparent, it is not necessary to write such data to cache if it is merely designated for deletion because it will still be accessible at its respective address location on the volume until it is actually overwritten.

It should be noted that although the snapshot has been taken at time 6, there is no need, yet, to record any of the (upper case) data in the volume to snapshot cache because the current volume accurately reflects what the state of the volume is or was at time 6. Since the volume is still the same as it was at time 6, nothing changes at time 7.

The second snapshot is taken at time 11. The volume at that point is “aeIG.” Data “I” is now “primed,” as denoted by the prime symbol, and data “G” remains primed. Again, as stated previously, it is at this point that the first snapshot cache 752 is permanently fixed. It is no longer necessary to add any further information to this first snapshot cache 742.

At time 13, a command to delete the data stored at address 4 is received. Thus, data “G” becomes data “g” in the volume at time 13. The third snapshot is taken at time 15. The volume at that point is “aeIg.” Data “I” remains “primed” and data “g,” although now designated as ready for deletion, also remains primed. Again, as stated previously, it is at this point that the second snapshot cache 754 is permanently fixed (with no data stored therein). It is no longer necessary to add any further information to this second snapshot cache 744.

Then, at time 17, a command to write data “J” to address 4 is received. Data “J” will be replacing data “g,” again, which has already been designated for deletion. However, because data “g” was part of both snapshots 2 and 3, data “J” is not immediately written to this address. The copy on write process is performed so that data “G” is written to the third snapshot cache at time 18 as shown in grid 746. Once data “G” has been written to the third snapshot cache 746, data “J” can be safely written to address 4 of the volume at time 19.

Finally, it should be noted that data “I,” which is part of two of the snapshots, remains primed because it has not yet been overwritten and, thus, has not yet been written to cache during the time duration of FIG. 7.

Turning briefly to FIG. 8, a state diagram 800 illustrates the various states that exemplary data “K” may go through according to the process described in FIG. 7.

Turning now to FIG. 9, a method 900 for performing the series of operations 700 from FIG. 7 is illustrated. First, the system waits (Step 910) until a command is received from the system, from an administrator of the system, or from a user of the system. If a command to take a snapshot is received (Step 920), then a new snapshot cache is started (Step 930), all in-use data (i.e., data in upper case letters using the convention of FIG. 7) on the volume is primed (Step 935) for later caching, and the previous snapshot cache, if one exists, is ended (Step 940). The process then returns to Step 910 to wait for another command.

If the determination in Step 920 is negative, then the system determines (Step 950) whether a command to write new data to the volume has been received. If so, then the system determines (Step 960) whether the data on the volume that is going to be overwritten needs to be cached (i.e., has the data been “primed”?). For example, from FIG. 7, only data “H” and “G” needed to be cached. If the determination in Step 960 is positive, then the data to be overwritten on the volume is written (Step 970) to the current snapshot cache. If the determination in Step 960 is negative or after Step 970 has been performed, then the new data is written (Step 980) to the volume. The process then returns to Step 910 to wait for another command.

If the determination in Step 950 is negative, then the system determines (Step 990) whether a command to delete data from the volume has been received. If not, then the process returns to Step 910 to wait for another command. If so, then the system designates or indicates (Step 995) that the particular volume data can be deleted and the associated space on the volume is available for new data. The process then returns to Step 910 to wait for another command.
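A hypothetical C++ sketch of method 900 is given below, extending the earlier write sketch with an explicit set of primed addresses. As in FIG. 7, taking a snapshot primes every granule currently in use, a write copies a primed granule to the current snapshot cache before overwriting it, and a delete merely marks the space available without any caching. The structures and names remain illustrative assumptions, not the incorporated implementation.

    #include <cstdint>
    #include <map>
    #include <set>
    #include <string>
    #include <vector>

    using Granule = std::string;
    using Volume  = std::map<std::uint64_t, Granule>;
    using Cache   = std::map<std::uint64_t, Granule>;

    // Hypothetical sketch of method 900 (FIG. 9).
    class PrimedSnapshotWriter {
    public:
        explicit PrimedSnapshotWriter(Volume& v) : volume(v) {}

        // Steps 930-940: start a new cache and prime all data currently in use.
        void takeSnapshot() {
            caches.emplace_back();
            // Step 935: every in-use granule becomes primed; granules already
            // primed by an earlier snapshot simply remain primed.
            primed.insert(inUse.begin(), inUse.end());
        }

        // Steps 950-980: copy a primed granule to cache before overwriting it.
        void write(std::uint64_t address, const Granule& newData) {
            if (primed.count(address)) {
                caches.back()[address] = volume[address];  // Step 970
                primed.erase(address);                     // cached once, never again
            }
            volume[address] = newData;                     // Step 980
            inUse.insert(address);
        }

        // Steps 990-995: a delete only marks the space as available for reuse.
        void erase(std::uint64_t address) {
            inUse.erase(address);   // granule becomes "lower case"; it stays primed if it
        }                           // was primed, and stays readable until overwritten

        std::vector<Cache> caches;

    private:
        Volume& volume;
        std::set<std::uint64_t> inUse;   // "upper case" granules
        std::set<std::uint64_t> primed;  // granules that must be cached before overwrite
    };

Replaying the commands of FIG. 7 through this sketch caches only “H” (into the first cache) and “G” (into the third cache), leaving the second cache empty, in agreement with table 760.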

Turning now to FIGS. 10a and 10b, a fourth set of operations 1000a, 1000b, respectively, (“Create First and Second Modified Historical Volumes”) are shown in which a “create modified volume at a snapshot moment” command is received. The system (i) reconstructs what the volume looked like at an historical point in time at which the respective snapshot was taken and then (ii) enables such volume to be modified. Modifications to such volumes may be made directly by a system administrator or system user at the granule level of the cache; however, more than likely, modifications are made at a system administrator user interface level or at an interface level of the system user. Such modifications at the interface level are then mapped by the system to the granule level of the cache. The process of making modified historical volumes will now be discussed in greater detail.

FIGS. 10a and 10b are divided generally into three separate but related sections 1010, 1030, 1020. Turning first to FIG. 10a, the first section 1010 illustrates a timeline or time axis. This timeline 1010 is the same as the timeline 310 previously discussed in FIG. 3. As will be recalled, the first snapshot from FIG. 3 was taken at time 5 and, for ease of reference, is shown again in FIG. 10a. The second section 1030 of FIG. 10a graphically illustrates the volume, as it existed in the past, and the data stored therein at any particular point in time along timeline 1010. Again, this historical volume grid 1030 is identical to the volume grid 330 from FIG. 3. The third section 1020 of FIG. 10a graphically illustrates the operations that are performed by the system to “create a modified historical volume.” From previous discussions, it will be appreciated that snapshot caches 1052, 1054, and 1056 are read only. In order to make them read/write (or at least to appear read/write at the system administrator or system user level), the system creates corresponding write snapshot caches 1062, 1064, and 1066. When created, these write snapshot caches 1062, 1064, and 1066 are empty (i.e., all granules are shaded to illustrate that no data is contained therein). As previously stated, the system enables data to be written to particular addresses of such write snapshot caches 1062, 1064, and 1066 either directly or after mapping of data modifications from the user interface level to the cache granule level. For purposes of this example and as shown in FIGS. 10a and 10b, write snapshot caches 1062 and 1064 each have data already written to particular addresses therein.

The process of creating a modified first historical volume 1070 then is quite similar to the process of recreating an actual historical volume, as illustrated by column 670 from FIG. 6a. For example, column 1037 identifies what data was originally contained in the volume at time 5, when the first snapshot was taken. The system could recreate such information based on its access to the data from the current volume 1035, as it exists immediately after time 22, and to the read only snapshot caches 1052, 1054, and 1056.

The process of creating the modified first historical volume, however, starts first with the write snapshot cache corresponding to the snapshot to which the system is being reverted. In FIG. 10a, the system starts with write snapshot cache 1062. If any data exists in any address therein, it is immediately written to the modified historical volume 1070 at the corresponding address location (in this case, addresses 1 through 3 are written directly from the write snapshot cache 1062 data). From then on, the read process described in FIG. 6a is followed for each remaining address location. In this case, only address 4 needs to be recreated. Thus, after the above procedures are performed, column 1070 does not match column 1037 except at address 4.

Likewise, turning to FIG. 10b, the ability to create a modified second historical volume 1080 then is quite similar to the process of recreating an historical volume, as illustrated by column 680 from FIG. 6b. Caches 1052 and 1062 are ignored. The system starts with write snapshot cache 1064. If any data exists in any address therein, it is immediately written to the modified historical volume 1080 at the corresponding address location (in this case, addresses 1 and 4 are written directly from the write snapshot cache 1064 data). From then on, the read process described in FIG. 6b is followed for each remaining address location. In this case, address 2 data “E” is ultimately obtained from the current volume 1035, as it exists immediately after time 22. Address 3 data “F” is ultimately obtained from read only snapshot cache 1056. Thus, after the above procedures are performed, column 1080 does not match column 1038 except at addresses 2 and 3.
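
For illustration, a minimal C++ sketch of reading one address from a modified historical volume follows: the per-snapshot write cache is consulted first, then the ordinary historical read of FIGS. 6a and 6b. The function signature and container choices are hypothetical.

```cpp
// Sketch of a read from a modified historical volume (FIGS. 10a/10b).
#include <cstdint>
#include <map>
#include <optional>
#include <vector>

using Address = std::uint64_t;
using Granule = std::vector<std::uint8_t>;
using Cache   = std::map<Address, Granule>;

std::optional<Granule> readModifiedHistorical(
        Address addr,
        std::size_t snapIndex,                     // index of the snapshot being viewed
        const std::vector<Cache>& writeCaches,     // e.g., caches 1062, 1064, 1066
        const std::vector<Cache>& readOnlyCaches,  // e.g., caches 1052, 1054, 1056
        const Cache& currentVolume) {
    // 1. Writes made within the snapshot take precedence.
    if (auto it = writeCaches[snapIndex].find(addr); it != writeCaches[snapIndex].end())
        return it->second;
    // 2. Otherwise, consult the read-only caches from this snapshot forward in time.
    for (std::size_t i = snapIndex; i < readOnlyCaches.size(); ++i)
        if (auto it = readOnlyCaches[i].find(addr); it != readOnlyCaches[i].end())
            return it->second;
    // 3. Not cached anywhere: the data is unchanged on the current volume.
    if (auto it = currentVolume.find(addr); it != currentVolume.end())
        return it->second;
    return std::nullopt;                           // the address holds no data for this snapshot
}
```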

Turning briefly to FIG. 11, an exemplary method 1100 for performing copy on write (COW) procedures, in a preferred manner, is illustrated. Such method provides a fairly secure or safe method of performing such copy on write procedures that ensures that no information is lost or prematurely cached or overwritten in the process, even in the event of a power failure or power loss in the middle of such procedure.

Specifically, the system waits (Step 1110) for a request to replace a block of data on the volume. Step 1110 is triggered, for example, when a command to write old data to cache is received (as occurs in Step 570 of FIG. 5), when a request to write primed data to the current snapshot is received (as occurs in Step 970 of FIG. 9), or the like. When this occurs, the old or primed data is read (Step 1115) from the volume address.

The system then checks (Step 1120) to determine whether a fault has occurred. If so, the system indicates (Step 1170) that there has been a failure, and the copy on write process is halted. If the determination in Step 1120 is negative, then the system writes (Step 1125) the old or primed data to the current snapshot cache.

Again, the system then checks (Step 1130) to determine whether a fault has occurred. If so, the system indicates (Step 1170) that there has been a failure, and the copy on write process is halted. If the determination in Step 1130 is negative, then the system determines (Step 1135) whether the snapshot cache is temporary. If so, then the system merely writes (Step 1150) an entry to the memory index. If the snapshot cache is not temporary, then the system writes (Step 1140) an entry to the disk index file.

Again, the system then checks (Step 1145) to determine whether a fault has occurred. If so, the system indicates (Step 1170) that there has been a failure, and the copy on write process is halted. If the determination in Step 1145 is negative, then the system also writes (Step 1150) an entry to the memory index.

Finally, the system again checks (Step 1155) to determine whether a fault has occurred. If so, the system indicates (Step 1170) that there has been a failure, and the copy on write process is halted. If the determination in Step 1155 is negative, then the system indicates (Step 1160) that the write to the cache was successful, and the system then allows the new data to be written to the volume over the old data that was cached.
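
For illustration only, the following C++ sketch restates the fault-checked sequence of method 1100. The I/O primitives are hypothetical stubs standing in for the driver's real operations; each returns false when a fault occurs.

```cpp
// Sketch of the fault-checked copy on write sequence of method 1100 (FIG. 11).
#include <cstdint>
#include <vector>

using Address = std::uint64_t;
using Granule = std::vector<std::uint8_t>;

// Hypothetical I/O primitives, stubbed here; each returns false on a fault.
static bool readVolumeGranule(Address, Granule& out) { out.assign(4096, 0); return true; }
static bool appendToSnapshotCache(const Granule&, Address& cacheAddr) { cacheAddr = 0; return true; }
static bool appendToDiskIndexFile(Address, Address) { return true; }
static bool addToMemoryIndex(Address, Address) { return true; }

// Returns true only when the old data has been safely preserved and the new
// data may be committed to the volume (Step 1160); any fault halts the process (Step 1170).
bool copyOnWrite(Address volumeAddr, bool cacheIsTemporary) {
    Granule old;
    if (!readVolumeGranule(volumeAddr, old))              // Step 1115, fault check 1120
        return false;

    Address cacheAddr = 0;
    if (!appendToSnapshotCache(old, cacheAddr))           // Step 1125, fault check 1130
        return false;

    if (!cacheIsTemporary &&                              // Step 1135
        !appendToDiskIndexFile(volumeAddr, cacheAddr))    // Step 1140, fault check 1145
        return false;

    if (!addToMemoryIndex(volumeAddr, cacheAddr))         // Step 1150, fault check 1155
        return false;

    return true;                                          // Step 1160: caller may overwrite the volume
}
```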

As will be apparent from the foregoing detailed description, this preferred embodiment of a method of the present invention provides a means for taking and maintaining a snapshot that is highly efficient in its consumption of the finite storage capacity allocated for the snapshot data, even when multiple snapshots are taken and maintained over extended periods of time.

Exemplary System Administrator and User Interfaces

Before continuing with the detailed description of further aspects, systems and methodologies of the present invention, it will be useful to quickly examine a number of system administrator and system user interfaces, in FIGS. 12 through 32, that provide one preferred means for interacting with the snapshot system of the present invention.

Turning first to FIG. 12, a screen shot illustrates a preferred control panel for use with the present invention. The control panel includes buttons and folders across the top of the page and links within the main window. Specifically, a link to “Global Settings” forwards the user to FIG. 13; a link to “Schedules” forwards the user to FIGS. 14-16; a link to “Volume Settings” forwards the user to FIGS. 17-18; a link to “Persistent Images” forwards the user to FIGS. 19-23; a link to “Restore Persistent Images” forwards the user to FIGS. 24-26; folder “Disks and Volumes” takes the user to FIGS. 27-31; and button “Status” at the top of the page forwards the user to FIG. 32.

FIG. 13 illustrates a screen shot of the Global Settings page. The variables that are modifiable by the user are shown in the main window.

FIG. 14 illustrates a screen shot of the Schedules page. This page shows what snapshots are currently scheduled to be taken and relevant parameters of the same. The button on the right called “New” allows the user to schedule a new snapshot, which occurs on the page shown in FIG. 15. The button on the right called “Properties” enables the user to edit a number of properties and variables associated with the specific scheduled snapshot selected by the box to the left of the page, which occurs on the page shown in FIG. 16. The button on the right called “Delete” allows the user to delete a selected schedule.

FIG. 17 illustrates a screen shot of the Volume Settings page. This page lists all available volumes that may be subject to snapshots. By selecting one of the listed volumes and the button on the right called “Configure,” the user is taken to the screen shot shown in FIG. 18, in which the user is enabled to edit configuration settings for the selected volume.

FIG. 19 illustrates a screen shot of the Persistent Images page. This page lists the persistent images currently being stored on the system. The user has several button options on the right hand side. By selecting “New,” the user is taken to the page shown in FIG. 20, in which the user is able to create a new persistent image. By selecting “Properties,” the user is taken to the page shown in FIG. 21, in which the user is able to edit several properties for a selected persistent image. By selecting “Delete,” the user is taken to the page shown in FIG. 22, in which the user is able to confirm that he wants to delete the selected persistent image. Finally, by selecting “Undo,” the user is taken to the page shown in FIG. 23, in which the user is able to undo all changes (e.g., “writes”) to the selected persistent image. Choosing “OK” in FIG. 23 resets the persistent image to its original state.

FIG. 24 illustrates a screen shot of the Persistent Images to Restore page. This page lists the persistent images currently being stored on the system and to which the user can restore the system, if desired. The user has several button options on the right hand side. By selecting “Details,” the user is taken to the page shown in FIG. 25, in which the user is presented with detailed information about the selected persistent image. By selecting “Restore,” the user is taken to the page shown in FIG. 26, in which the user is asked to confirm that the user really wants to restore the current volume to the selected snapshot image.

FIG. 27 illustrates a screen shot of the front page of the Disks and Volumes settings. By selecting “Persistent Storage Manager,” the user is taken to the page shown in FIG. 28, which displays the backup schedule currently being implemented for the server or computer. The user has several buttons on the right hand side of the page from which to choose. By selecting the “Properties” button, the user is taken to the page shown in FIG. 29, in which the user is able to specify when, where, and how backups of the system will be taken. By selecting the “Create Disk” button, the user is taken to the page shown in FIG. 30, in which the user is able to request that a recovery disk be created. The recovery disk enables the user or system administrator to restore a volume in case of catastrophe. By selecting the “Start Backup” button, the user is taken to the page shown in FIG. 31, in which the user is able to confirm that he wants to start a backup immediately.

FIG. 32 merely illustrates a screen shot of the Status page presented, typically, to a system administrator. This page lists an overview of alerts and other information generated by the system that may be of interest or importance to the system administrator without requiring the administrator to view all of the previously described screens.

Hide and Unhide

In accordance with a feature of a preferred method and system of the present invention, a volume address may be omitted from future snapshots, or hidden, as indicated by “−” in FIG. 33. It will be appreciated from a review of FIG. 33 that when a volume location (address) is identified as no longer being subject to a snapshot, data at that location is not preserved before being replaced upon a write to that location, even if there was a snapshot taken of the volume between the time that the hide (or “omit”) command was made and the subsequent write occurred. Furthermore, it will be apparent from a review of FIG. 33 that a granule is not cached simply because an unhide command is given (indicated by a “+” in FIG. 33) and then a write at that address occurs prior to any snapshot being taken. Conversely, if a granule needs caching at a location to which a hide command is given, then that granule is cached. It will also be apparent to one of skill in the art that, when taking a snapshot, the prime bit is not set for an address that is hidden.
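
A minimal C++ sketch of how the hide and unhide commands could interact with priming and caching follows; the sets and the cacheGranule() helper are hypothetical illustrations of the behavior described above, not the incorporated implementation.

```cpp
// Sketch of hide/unhide interaction with priming and caching (FIG. 33).
#include <cstdint>
#include <set>

using Address = std::uint64_t;

struct HideState {
    std::set<Address> hidden;   // addresses omitted from future snapshots ("-")
    std::set<Address> primed;   // addresses whose data must be cached before overwrite

    void hide(Address a) {
        if (primed.count(a)) {  // a granule needing caching is cached before hiding
            cacheGranule(a);
            primed.erase(a);
        }
        hidden.insert(a);
    }

    void unhide(Address a) {    // "+": no caching occurs until a snapshot is taken
        hidden.erase(a);
    }

    void onSnapshot(const std::set<Address>& inUse) {
        primed.clear();
        for (Address a : inUse)
            if (!hidden.count(a))       // the prime bit is not set for hidden addresses
                primed.insert(a);
    }

private:
    void cacheGranule(Address) { /* copy on write into the current snapshot cache */ }
};
```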

Tracking of Snapshot Data

Snapshot data is tracked in order for the correct granule to be returned in response to reads from the snapshot. A logical structure for tracking snapshot data is illustrated in FIG. 34. A Header file is maintained on the volume (but is excepted from the data preservation method) and is utilized to record therein information about each snapshot. Specifically, the Header file includes a list of Snap Master records, each of which includes one or more Snapshot Entries. Each Snap Master record corresponds to a data group (e.g., snapshots of multiple volumes taken at the same time) and, in turn, each Snapshot Entry corresponds to a snapshot of one of the volumes. Each Snapshot Entry includes Index Entries referenced by an Index file, which for respective snapshots map volume addresses to cache addresses where snapshot data has been cached. The physical structure of the Header file, Index file, Cache file (also referred to as a “diff” file), and volume are illustrated in FIG. 35. Basically, the Header file, Index file, and cache are all that is required to locate the correct snapshot data for a given snapshot. Furthermore, the Header file, Index file, and Cache file all comprise files so that, upon a powering down of the computer, the information is not lost. Indeed, the updates to these files also are conducted in a manner so that, upon an unexpected powering down or system crash during a write to the Header file, Index file, or cache, or during committing of a write to the volume that replaces snapshot data, the integrity of the volume is maintained.
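
The logical structure of FIG. 34 can be illustrated by the following C++ sketch of the tracking records: a Header file listing Snap Master records, each holding Snapshot Entries whose Index Entries map volume addresses to cache (“diff” file) addresses. The field names are hypothetical and are not those of the incorporated source code.

```cpp
// Sketch of the logical tracking structures of FIG. 34.
#include <cstdint>
#include <string>
#include <vector>

using VolumeAddress = std::uint64_t;
using CacheAddress  = std::uint64_t;

struct IndexEntry {                 // one preserved granule
    VolumeAddress volumeAddr;       // where the data lived on the volume
    CacheAddress  cacheAddr;        // where it is preserved in the Cache ("diff") file
};

struct SnapshotEntry {              // one snapshot of one volume
    std::string   volumeName;
    std::uint64_t snapshotNumber;
    bool          deleted = false;  // set by the first phase of snapshot deletion
    std::vector<IndexEntry> indexEntries;
};

struct SnapMaster {                 // one data group: snapshots taken at the same time
    std::uint64_t timestamp;
    std::vector<SnapshotEntry> entries;
};

struct HeaderFile {                 // persisted on the volume, excepted from preservation
    std::vector<SnapMaster> masters;
};
```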

Snapshot Delete and Cache Scavenge

In another aspect of the present invention, there may be times when it is necessary or desirable to delete snapshots being maintained by the system of the present invention. Snapshot deletion requires some actions that are not required in less sophisticated systems. Since each snapshot may contain data needed by a previous snapshot, simply releasing the index entries (which are typically used to find data stored on the volume or in cache) and “freeing up” the cache granules associated with the snapshot may not work. As will be recalled from the above discussions, it is sometimes necessary to consult different snapshot caches when trying to read a particular snapshot; thus, there is a need for a way to preserve the integrity of the entire system when deleting undesired snapshots.

The present invention processes such deletions in two phases. First, when a snapshot is to be deleted, the snapshot directory is unlinked from the host operating system, eliminating user access. The Snapshot Master record and each associated Snapshot Entry are then flagged as deleted. Note that this first phase does not remove anything needed by a previously created snapshot to return accurate data.

The second, or “scavenger,” phase occurs immediately after a snapshot is created, after a snapshot is deleted, and upon a system restart. The scavenger phase reads through all Snapshot Entries, locating snapshots that have been deleted. For each snapshot entry that has been deleted, a search is made for all data granules associated with that snapshot that are not primed or required by a previous snapshot. Each such unneeded granule is then released from the memory index, the Index File, and the cache file. Other granules that are required to support earlier snapshots remain in place.

When the scavenger determines that a deleted snapshot entry contains no remaining cache associations, it is deleted. When the last snapshot entry associated with a snapshot master entry is deleted, the snapshot master is deleted.
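
The scavenger pass can be sketched in C++ as follows. Whether a granule is still required by an earlier snapshot is reduced here to a hypothetical predicate, and the stubs stand in for the real release of memory index, Index file, and cache file entries; this is an illustration under those assumptions, not the incorporated implementation.

```cpp
// Sketch of the scavenger pass of the two-phase snapshot delete.
#include <algorithm>
#include <cstdint>
#include <vector>

struct IndexEntry { std::uint64_t volumeAddr = 0, cacheAddr = 0; };

struct SnapshotEntry {
    bool deleted = false;
    std::vector<IndexEntry> indexEntries;
};

// Hypothetical: true if an earlier snapshot needs this granule to return
// accurate data (e.g., it is the only preserved copy of that granule).
static bool requiredByEarlierSnapshot(const std::vector<SnapshotEntry>&,
                                      std::size_t, const IndexEntry&) { return false; }
static void releaseGranule(const IndexEntry&) { /* memory index, Index file, cache file */ }

void scavenge(std::vector<SnapshotEntry>& snaps) {
    for (std::size_t i = 0; i < snaps.size(); ++i) {
        if (!snaps[i].deleted) continue;           // only deleted snapshots are scavenged
        auto& entries = snaps[i].indexEntries;
        entries.erase(std::remove_if(entries.begin(), entries.end(),
                          [&](const IndexEntry& e) {
                              if (requiredByEarlierSnapshot(snaps, i, e))
                                  return false;    // keep: it supports an earlier snapshot
                              releaseGranule(e);   // unneeded: release everywhere
                              return true;
                          }),
                      entries.end());
    }
    // A deleted Snapshot Entry with no remaining cache associations is then
    // removed; a Snap Master whose last entry is removed is removed as well.
}
```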

Persistence: Snapshot Reconstruction

In another aspect of the present invention, when the system computer is restarted after a system shutdown (whether intentional or through a system failure), the Header and Index files are used to reconstruct the dynamic snapshot support memory contents.

On restart, the memory structures are set to a startup state. In particular, a flag is set indicating that snapshot reconstruction is underway, the primed map is set with all entries primed, and the cache granule map is set to all entries unused. The Header File is then consulted to create a list of Snapshot Master entries, Snapshot Entries, and the address of the next available cache file granule.

During the remainder of the reconstruction process, writes may occur to volumes that have active snapshots. Prior to completion of snapshot reconstruction, granule writes to blocks that are flagged prime are copied to the end of the Cache file and recorded in the memory index. The used cache granule map and next available granule address are likewise updated. One skilled in the art will appreciate that setting the prime table to all primed and writing only to the end of the granule cache file will record all first writes to the volume. At this phase, some redundant data is potentially preserved while the prime granule map is being recreated.

Each index entry is consulted in creation order sequence. Blank entries, entries that have no associated Snapshot Entry, and entries that are not associated with a currently available volume device are ignored. Each other entry is recorded in the memory index. If any duplicate entries are located, the subsequently recorded entry replaces the earlier entry. An entry is considered a duplicate if it records the same snapshot number, volume granule address, and cache granule address. The age of each index entry is indicated by a time stamp or similar construct recorded when the entry was originally created.

At this stage in reconstruction, the index in memory is completed. Each snapshot will then be consulted to create the single system wide primed granule map and used cache map.

For each memory index entry for the snapshot, the associated primed granule map element is cleared and the granule cache map entry is set.

On completion, the flag indicating snapshot reconstruction is reset. The cache granule map, primed map, memory index, and file index have been restored to include the state at shutdown, as well as all preserved volume writes that occurred during the reconstruction process.
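
A C++ sketch of the reconstruction sequence follows. The containers and the shape of the index key are hypothetical; only the ordering of steps is taken from the description above.

```cpp
// Sketch of snapshot reconstruction at restart from the Header and Index files.
#include <cstdint>
#include <map>
#include <set>
#include <utility>
#include <vector>

using VolumeAddress = std::uint64_t;
using CacheAddress  = std::uint64_t;

struct IndexEntry {
    std::uint64_t snapshotNumber = 0;
    VolumeAddress volumeAddr = 0;
    CacheAddress  cacheAddr = 0;
    bool blank = false;                // blank entries are ignored
    bool hasSnapshotEntry = true;      // entries with no Snapshot Entry are ignored
    bool volumeAvailable = true;       // entries for unavailable volumes are ignored
};

struct MemoryState {
    bool reconstructing = false;
    std::set<VolumeAddress> primed;                 // primed granule map
    std::set<CacheAddress>  usedCacheGranules;      // cache granule map
    std::map<std::pair<std::uint64_t, VolumeAddress>, CacheAddress> memoryIndex;
};

void reconstruct(MemoryState& m,
                 const std::set<VolumeAddress>& allVolumeAddresses,
                 const std::vector<IndexEntry>& entriesInCreationOrder) {
    m.reconstructing = true;
    m.primed = allVolumeAddresses;     // all entries primed: every first write is preserved
    m.usedCacheGranules.clear();       // cache granule map set to all unused

    // Replay the Index file in creation order; a later duplicate replaces the earlier entry.
    for (const auto& e : entriesInCreationOrder) {
        if (e.blank || !e.hasSnapshotEntry || !e.volumeAvailable) continue;
        m.memoryIndex[{e.snapshotNumber, e.volumeAddr}] = e.cacheAddr;
    }

    // Rebuild the system-wide primed map and used-cache map from the index.
    for (const auto& [key, cacheAddr] : m.memoryIndex) {
        m.primed.erase(key.second);    // this granule is already preserved
        m.usedCacheGranules.insert(cacheAddr);
    }

    m.reconstructing = false;          // writes preserved during reconstruction are retained
}
```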

Restoration of System to Another State

A preferred embodiment of the present invention also provides restore functionality that allows restoration of a volume to any state recorded in a snapshot while retaining all snapshots. This is accomplished by walking through the index while determining which granules are being provided by the cache for the restored snapshot. Those volume granules are replaced by the identified granules from cache. This replacement operation is subject to the same volume protection as any other volume writes, so the volume changes engendered by the restore are preserved in the snapshot set. FIG. 36 illustrates steps in such a restore operation.

The operation begins at Step 3702 when a restore command is received. In Step 3704, a loop through all volume granule addresses on the system is prepared. At Step 3706, the next volume granule address is read. At Step 3708, a process restores the selected granule by searching for the selected granule in each snapshot index, commencing with the snapshot to be restored (Step 3712) and ending with the most recent snapshot (Step 3716). The process at Steps 3712 and 3714 establishes index and end counters to traverse the snapshots. Block 3716 compares the index “i” to the termination value “j”. If the comparison indicates that all relevant snapshots have been searched, the current volume value is unchanged from the restoration snapshot and the process returns to Step 3708. Block 3718 determines if the selected granule has been cached for the selected snapshot. If so, the process continues at Step 3722, replacing the volume granule data with the located cache granule data and continuing to Step 3708. If the granule is not located in Step 3718, then block 3720 will increment the snapshot index “i” and continue execution at Step 3714.
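
For illustration, the restore walk of FIG. 36 can be sketched in C++ as follows: for each volume granule address, the first cached copy found from the restored snapshot forward supplies the data; if none is found, the current volume value is already the restoration value. The names are hypothetical, and the write helper stands in for the normally protected volume write.

```cpp
// Sketch of the restore walk of FIG. 36 (Steps 3702 through 3722).
#include <cstdint>
#include <map>
#include <vector>

using Address = std::uint64_t;
using Granule = std::vector<std::uint8_t>;
using Cache   = std::map<Address, Granule>;

// In the real system this write passes through the usual copy on write
// protection, so the restore itself is captured in the snapshot set.
static void protectedVolumeWrite(Cache& volume, Address addr, const Granule& g) {
    volume[addr] = g;
}

void restoreToSnapshot(Cache& volume,
                       const std::vector<Cache>& snapshotCaches,
                       std::size_t restoreIndex) {           // snapshot "i" to restore
    std::vector<Address> addresses;                           // Step 3704: prepare the loop
    for (const auto& kv : volume) addresses.push_back(kv.first);

    for (Address addr : addresses) {                          // Steps 3706-3708
        for (std::size_t i = restoreIndex; i < snapshotCaches.size(); ++i) {  // Steps 3712-3720
            auto it = snapshotCaches[i].find(addr);
            if (it != snapshotCaches[i].end()) {              // Step 3718: cached for snapshot i
                protectedVolumeWrite(volume, addr, it->second);  // Step 3722: replace volume data
                break;
            }
        }
        // If no cache holds the granule (comparison at Step 3716), the current
        // value is unchanged from the restoration snapshot.
    }
}
```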

The user experience in restoring the system to a previous snapshot is illustrated by screenshots in FIGS. 37 through 42. In FIG. 37, a snapshot has been taken at 12:11 PM of volumes E and F. Another snapshot is taken at 12:18 PM of volumes E and F as shown in FIG. 38. Furthermore, prior to the 12:18 PM snapshot but after the 12:11 PM snapshot, a folder titled “New Folder” was created on both volumes E and F, as shown in FIG. 40. Following the 12:18 PM snapshot, the user decides to restore the system to the state in which it existed at 12:11 PM. The user is presented a screen to confirm his intention to perform the restore operation as shown in FIG. 39. FIG. 41 illustrates the state of the system prior to the restore and FIG. 42 illustrates the state of the system following the restore. As will be noted, volumes E and F no longer contain the “New Folder” that was created after the 12:11 PM snapshot; however, it should be noted that this folder does appear within the folder for the 12:18 PM snapshots of volumes E and F. This folder, and any data contained therein, can be read and copied therefrom into the current state of the system (i.e., the 12:11 PM state) even though the folder and data therein were not created until some time after 12:11 PM. Additionally, in accordance with a further feature of the invention, the user also could “restore” the system to the state that it was in when the 12:18 PM snapshot was taken, even though currently in the earlier, 12:11 PM state.

To insure against inadvertent reversions, an initiation sequence preferably is utilized in accordance with preferred embodiments of the present invention wherein a user's intention to perform the reversion operation on the computer system is confirmed prior to such operation. Preferred initiation sequences are disclosed, for example, in copending Witt International patent application serial no. PCT/US02/40106 filed Dec. 16, 2002, and Witt U.S. patent application Ser. Nos. 10/248,425 filed Jan. 18, 2003; 10/248,424 filed Jan. 19, 2003; 10/248,425 filed Jan. 19, 2003; 10/248,426 filed Jan. 19, 2003; 10/248,427 filed Jan. 19, 2003; 10/248,428 filed Jan. 19, 2003; 10/248,429 filed Jan. 19, 2003; and 10/248,430 filed Jan. 19, 2003, each of which is incorporated herein by reference.

Utilization of Snapshots in New and Useful Ways

In view of the systems and methods of managing snapshots as now described in detail herein, and as exemplified by the source code of the U.S. provisional patent application and Appendix A that is incorporated by reference herein, revolutionary benefits and advantages now can be had by utilizing snapshots in many various contexts that, heretofore, simply would not have been practical if not, in fact, impossible. Several such utilizations of snapshots that are enabled by the systems and methods of managing snapshots disclosed herein, including by the incorporated code, are considered to be part of the present invention, and now are described below.

HDD Data History, Virus Protection, and Disaster Recovery

A conventional hard disk drive (HDD) controller, which may be located on a controller board within a computer or within the physical HDD hardware unit itself (hereinafter “HDD Unit”), includes the capability to execute software. Indeed, controller boards and HDD Units now typically, when shipped from the manufacturer, include their own central processing units (CPU), memory chips, buffers, and the like for executing software for processing reads and writes to and from computer readable storage media. Furthermore, the software in these instances is referred to as “firmware” because the software is installed within the memory chips (such as flash RAM memory or ROM) of the controller boards or HDD Units. The firmware executes outside of the environment of the operating system of the computer utilizing the HDD storage and, therefore, is generally protected against alteration by software users of computers accessing the HDD and computer viruses, especially if implemented in ROM. Firmware thus operates “outside of the box” of the operating system. An example of HDD firmware utilized to make complete and incremental backup copies of a logical drive to a secondary logical drive for backup and fail over purposes is disclosed in U.S. patent application Ser. No. 2002/0133747A1, which is incorporated herein by reference.

In accordance with the present invention, computer executable instructions for taking and maintaining snapshots are provided as part of the HDD firmware, such as in a HDD controller board (see FIG. 43) and in the HDD Unit itself (see FIG. 44). Accordingly, reads and writes to snapshots in accordance with the present invention are implemented by the HDD firmware.

Specifically, in FIG. 43, a HDD controller board or card 4404 having the HDD firmware for taking and maintaining the snapshots of the present invention (referenced by “PSM Controller”) is shown as controlling disk I/O 4408 to HDD 4410, HDD 4412, and HDD 4414. HDD 4410 illustrates an example in which the finite data storage for preserving snapshot data coexists with a volume on the same HDD Unit. HDD 4412 and HDD 4414 illustrate an example in which the finite data storage comprises its own HDD separate and apart from the volume of which snapshots are taken. FIG. 43 also further illustrates the separation of the HDD firmware and its environment of execution from the computer system 4402.

With reference to FIG. 44, the HDD firmware is contained within the HDD Unit 4448 itself, which has a connector 4416 for communication with the computer system 4402. The HDD firmware is shown as residing in a disk controller circuit 4450 of the HDD Unit 4448. The storage system of the HDD is represented here as logically comprising a first volume 4444, which appears to the operating system of the computer system 4402 and is accessible thereby, and a second volume 4446 on which the snapshot data is preserved. The second volume 4446 does not appear to the operating system for its direct use.

Optionally, the HDD Unit 4448 includes a second connector 4416 as shown in FIG. 45 for attachment of volume 4420 and volume 4422. As illustrated, the firmware of the HDD Unit 4448 also takes and maintains snapshots of each of these additional volumes, the cache data of each preferably being stored on the respective HDD.

It should be noted that a security device 4406 is provided in association with the HDD controller card 4404 in FIG. 43 and with the HDD controller circuit 4450 in FIGS. 44 and 45. The security device represents a switch, jumper, or the like that is physically toggled by a person. Preferably, the security device includes a key lock for which only an authorized computer user or administrator has a key for toggling the switch between at least two states (e.g., secure and insecure). In either case, when in a first state, the HDD controller receives and executes commands from the computer system which otherwise could destroy the data on the volume prior to its preservation in the finite data storage. Such commands include, for example, a low level disk format, repartitioning, or SCSI manufacturer commands. Snapshot specific commands also could be provided for when in this state, whereby an authorized user or administrator could create snapshot schedules, delete certain snapshots if desired, and otherwise perform maintenance on and update as necessary the HDD firmware. When in a second state, however, the HDD controller would be “cut off” from executing any such commands, thereby insuring beyond doubt the integrity of the snapshots and the snapshot system and method.

In a preferred embodiment, approximately 20% of the HDD capacity is allocated for the finite data storage for preserving snapshot data by the firmware. Accordingly, the data storage for preserving the snapshot data of a 200 gigabyte HDD, which costs only about U.S. $300 today, would include a capacity of approximately 40 gigabytes, leaving 160 gigabytes available to the computer system for storage. Indeed, preferably only 160 gigabytes is presented to the operating system and made accessible. The other 40 gigabytes of data storage allocated for preserving the snapshot data preferably is not presented to the computer operating system.

It is believed that an average use of a computer, such as a desktop for home or business use, results in approximately a quarter megabyte of net changes per day for the entire 160 gigabyte HDD (i.e., there is a quarter megabyte difference on average when the HDD is viewed at day intervals). Preferably, the HDD firmware takes a new snapshot every day at some predetermined time or at some predetermined event. Under this scenario, snapshots can be taken and maintained, one for each day, for approximately one hundred and sixty thousand days, or 438 years (assuming the computer continues to be used during this time period). Essentially, a complete history of the state of the computer system as represented by the HDD each day automatically can be retained as a built in function of the HDD! If the snapshots maintained by the firmware are read only, rather than read write, and if the security device in accordance with preferred embodiments as shown, for example, in FIGS. 43, 44, and 45 is utilized, then the snapshots become a complete data history unchangeable after the fact by the user, a computer virus, etc. The integrity and security of the snapshots is insured. Indeed, it is believed that, because of the isolated execution of the firmware within the HDD Unit and protection by the security device from HDD commands that otherwise would destroy in wholesale fashion the volume data, the only way to damage or destroy the snapshots is to physically damage the HDD Unit itself. The high security of the HDD data history, in turn, gives rise to numerous advantages.

First, for instance, as a result of the HDD data history, disaster recovery can be performed by recovering data, files, etc., from any previous day in the life of the HDD Unit. Any daily snapshot throughout the life of the HDD Unit is available as it existed at the snapshot moment on that day. Indeed, the deletion of a file or infection thereof by a computer virus, for example, will not affect that file in any previously taken snapshot; accordingly, that file can be retrieved from a snapshot as it existed on the day prior to its deletion or infection.

Furthermore, the files of the snapshots of the HDD data history themselves can be scanned (remember that each snapshot is represented by a logical container on the base volume presented to the operating system of the computer) to determine when the virus was introduced into the computer system. This is especially helpful when virus definitions are later updated and/or when an antivirus computer program is later installed following infection of the computer system. The antivirus program thus is able to detect a computer virus in the HDD data history so that the computer system can be restored to the immediately previous day. Files and data not infected can also then be retrieved from the snapshots that were taken during the computer infection once the system has been restored to an uninfected state (remember that a reversion to a previous state does not delete, release, or otherwise remove snapshots taken in the intervening days that had followed the day of the state to which the computer is restored).

This extreme HDD data history also provides enormous dividends for forensic investigations, especially by law enforcement or by corporations charged with the responsibility of how their employees conduct themselves electronically. Once a daily snapshot is taken by the HDD firmware, it is as good as “locked” in a data vault and, in preferred embodiments, is unchangeable by any system user or software. The data representing the state of the HDD for each previous day is revealed, including email and accounting information. Furthermore, unless a user is expressly made aware of the snapshot functionality of the HDD firmware, or unless a user is permitted to explore the “snapshot” folder preferably maintained on the root directory of the volume, the snapshots will be taken and maintained seamlessly without the knowledge of the user. Only the computer administrator need know of the snapshots that occur and, preferably with physical possession of the key to the security device, the administrator will know that the snapshots are true and secure.

The same benefits are realized if the HDD Unit is used in a file server, or if the HDD Unit is used as part of network attached storage. For example, forty average users of a 200 gigabyte HDD would each have access to HDD data history representing the state of their data as it existed for each day over a ten year period. In order to protect against physical damage to the HDD Unit, data of the HDD Unit can be periodically backed up in accordance with conventional techniques, including the making of a backup copy of one of the snapshots itself while continued, ongoing access to the HDD is permitted.

In continuing with the HDD data history example, the snapshots can be layered by taking additional snapshots at a different, periodic interval. Accordingly, at the end of each week, a snapshot can be taken of the then current snapshot of that day of the week to comprise the “weekly” snapshot “series” or “collection.” A weekly snapshot series and a monthly snapshot series then can be maintained by the HDD firmware. Presentation of these series to a user would include, within a “snapshot” folder on the root directory, two subfolders titled, for example, “weekly snapshots” and “daily snapshots.” Within the “weekly snapshots” would appear a list of folders titled with the date of the day comprising the end of the week for each previous week, and within each such folder would appear the directory structure of the base volume in the state as it existed on that day. Within the “daily snapshots” would appear a list of folders titled with the date of each day for the previous days, and within each such folder would appear the directory structure of the base volume in the state as it existed on that day. This layering of the snapshots could further include a series of “monthly snapshots,” a series of “quarterly snapshots,” a series of “yearly snapshots,” and so on and so forth. It should be noted that little additional data storage space would be consumed by taking and maintaining these different series of snapshots.

If desired, the data storage for preserving the snapshots could be managed so as to protect against the unlikely event that the data storage would be consumed to such an extent that the snapshot system would fail. Preferred methods for managing the finite data storage are disclosed, for example, in copending Green U.S. patent application Ser. Nos. 10/248,460; 10/248,461; and 10/248,462, all filed on Jan. 21, 2003, and each of which is incorporated herein by reference.

Accordingly, but for protection against physical damage to the HDD Unit itself, such as damage by fire or a baseball bat, all of the benefits of conventional snapshots and backups are realized without the time and storage capacity constraints by the seamless integration into the HDD firmware of the systems and methods of the present invention. Indeed, the taking and maintaining of the snapshots is unnoticeable to the casual eye.

Temporal Database Management and Analysis, National Security/Homeland Defense, and Artificial Intelligence

Much academic and industry discussion has been focused in recent years on how to incorporate time as a factor in database management. See, for example, “Implementation Aspects of Temporal Databases,” Kristian Torp, http://www.cs.auc.dk/ndb/phd_projects/torp.html (copyrighted 1998, 2000); “Managing Time in the Data Warehouse,” Dr. Barry Devlin, InfoDB, Volume 11, Number 1 (June 1997); and “It's About Time! Supporting Temporal Data in a Warehouse,” John Bair, InfoDB, Volume 10, Number 1 (February 1996), each of which is incorporated herein by reference.

As recognized by Kristian Torp, for example, multiple versions of data are useful in many application areas such as accounting, budgeting, decision support, financial services, inventory management, medical records, and project scheduling, to name but a few. Temporal relational database management systems (DBMSs) are currently being designed and implemented that add built in support for storing and querying multiple versions of data and represent improvements to conventional relational DBMSs that only provide built in support for one (the current) version of data. Kristian Torp proposes in his thesis techniques for timestamping versions of data in the presence of transactions.

Furthermore, a debate has arisen as to whether time should be taken into account by database management programs themselves (the “incorporated” model), or whether time should be taken into account by applications that access the data from database management programs (the “layered” model).

The snapshot method and system of the present invention introduces yet a third, heretofore unknown and otherwise impractical, if not impossible, means for accounting for time as a factor in database management. Indeed, the method of taking and maintaining multiple snapshots inherently takes time into account, as time inherently is a critical factor in managing snapshot data. Thus, by taking and maintaining snapshots of data, each snapshot represents an instance of that data (its state at that snapshot time) and the series of snapshots represents the evolution of that data. Moreover, the higher the frequency of snapshots, the greater the resolution and the less the granularity of the evolution of the data as a function of time. Accordingly, by utilizing snapshot technology, preferably as provided by the systems and methods of the present invention, non temporal relational database management systems can be snapshotted on an ongoing basis, with the combination of all the snapshot data thereof thereby comprising a temporal data store.

Furthermore, within the context of referring to a temporal database, the present invention is considered to provide a temporal data store comprising a plurality of temporal data groups. In this regard, each temporal data group is unique to a point in time and includes one or more snapshots taken at that particular point in time, with the object of each snapshot comprising (1) a logical container, such as a file, a group of files, a volume, or portion of any thereof; or (2) a computer readable storage medium, or any portion thereof. Thus, except in the case where a data group is writable, all data in a data group necessarily shares in common the characteristic that the data is as it existed at the time point of the data group. For example, a snapshot of a first volume at a first time point and a snapshot of a second volume at that same time point, together, may comprise a temporal data group. In juxtaposition, snapshots forming part of a collection or series are each taken at a different time point and, therefore, will not coexist within the same data group as another snapshot of the series, although each snapshot of the series will have in common the same object.

As with multiple versions of data in conventional DBMSs, the temporal data store provided by the present invention efficiently provides multiple versions of data in the form of snapshot series or collections for analysis in many application areas such as accounting, budgeting, decision support, financial services, inventory management, medical records, and project scheduling. Furthermore, neither an incorporated architecture nor a layered architecture is necessary if the snapshot technology is utilized for managing and analyzing the temporal data. A series of snapshots continuously taken of the data suffices, and neither database management programs nor specific applications interfacing with such database management programs need to specifically be rewritten or modified to now account for time as a dimension. Running of the applications in the “current” time while reading the temporal data from the various instances of the data contained within the snapshot folders of the base volume in accordance with the present invention readily provides the solution now sought by so many others for accounting for time as a factor in database management.

Above and beyond providing advantages of conventional DBMSs, the temporal data store of the present invention further provides the ability to conduct multiple “what if” scenarios starting at any snapshot of a data group within a snapshot series. Specifically, because of the additional cache provided in conjunction with each snapshot for writes to the snapshot, above and beyond the cache provided for preservation of the snapshot data from the volume, the present invention includes the ability to return to the “pristine” snapshot (the original snapshot without writes thereto) by simply clearing the write cache. Multiple scenarios thus may be run for each snapshot starting at the same snapshot time (i.e., the “temporal juncture” of the various scenarios), and an analysis can be conducted of the results of each scenario and compared to the others in contrasting and drawing intelligence from the different results. In running the different scenarios, different rule sets can be applied to each snapshot for each scenario and within the context of each snapshot folder without altering the current state of the system and without permanently destroying the original snapshot. Moreover, because all snapshots are presented in the current state of the system, “what if” scenarios can be conducted on various, different snapshots in parallel. This ability to utilize snapshot technology to run a “what if” scenario on a snapshot, as well as to return to the pristine snapshot and rerun a different “what if” scenario using a different rule set, all while doing in parallel a similar analysis on other snapshots, provides a heretofore unknown and incredibly powerful analytical tool for data mining and data exploration. Moreover, by considering consecutive snapshots in a series in this analysis, data evolution can also be analyzed from each temporal juncture of the series.
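
A short C++ sketch of the pristine-reset operation follows: discarding the per-snapshot write cache returns the snapshot to its original state while leaving the read-only preservation cache intact, so a new “what if” rule set can be run from the same temporal juncture. The structure shown is a hypothetical illustration only.

```cpp
// Sketch of resetting a snapshot to its pristine state by clearing the write cache.
#include <cstdint>
#include <map>
#include <vector>

using Address = std::uint64_t;
using Granule = std::vector<std::uint8_t>;

struct Snapshot {
    std::map<Address, Granule> preservationCache;  // copy-on-write data; never touched here
    std::map<Address, Granule> writeCache;         // writes made within the snapshot context

    void resetToPristine() { writeCache.clear(); } // the original snapshot state is restored
};
```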

The implications for utilization of the snapshot technology of the present invention in intelligence gathering, especially for counterterrorism and national security interests, are staggering. Currently, the storage capacity required for the ability to run the magnitude of equivalent scenarios provided by the present invention is impractical if not impossible, even for the National Security Administration (or the recently created Department of Homeland Defense). For example, multiple rule sets for data mining and exploration in intelligence gathering can now be applied to snapshots of the data captured by the governmental intelligence agencies, and different scenarios for each temporal juncture in a snapshot series can be run in parallel. As a result of the present invention, for each temporal juncture of each snapshot identified for investigation, a system no longer need be restored to its previous state at a temporal juncture, the scenario executed, the system restored again to the same temporal juncture, the next scenario executed, and so on and so forth. Consequently, the implications are staggering. Snapshots existing on every day between Jan. 1, 2001, and Sep. 11, 2001, of email traffic passing through a particular node of the Internet backbone can be conveniently analyzed under different rule sets and investigative algorithms to determine which would be more effective and what information was or could have been known or available within the data archives that might have forewarned authorities to the tragic events that happened on Sep. 11, 2001.

It will also be apparent to those of ordinary skill in the art that the ability to “backtrack” to a previous temporal juncture and execute a different rule set also provides enormous advantages and additional functionality to artificial intelligence.

In summary, revolutionary advancements in data analysis and intelligence can now be had in areas such as medical information analysis (especially patient information analysis); financial analysis, including financial market analysis; communications analysis (such as email correspondence), especially for intelligence pertaining to terrorism and other national security/homeland defense interests; and Internet archiving and analysis. In each of these examples, the relevant data in the state as it existed for points in time can be readily analyzed online by appropriate algorithms, routines, and programs, especially those utilizing artificial intelligence and backtracking techniques.

Backups

While it will now be readily evident that the methods and systems for taking and maintaining snapshots of the present invention far exceed the mere use of a snapshot for creation of a backup copy onto some backup medium, such use of a snapshot nevertheless remains valid. Thus, in accordance with a feature of the present invention, a snapshot of a volume is represented as a logical drive when a backup of that volume is to be made. Thus, the backup program obtains the data of the snapshot by reading from the logical drive and writing the data read therefrom onto the backup medium, such as tape. Alternatively, the backup method and system of U.S. patent application Ser. No. 2002/0133747A1 is utilized in creating a backup. Moreover, a preferred embodiment of the present invention includes the combination of the backup method and system of U.S. patent application Ser. No. 2002/0133747A1 with the inventive snapshot method and system as generally represented by the code of the incorporated provisional patent application and described in detail above. Indeed, the backup may be made by reading not from the base volume itself, but from the most recent snapshot, thereby allowing continuous reads and writes to the base volume during the backup process.

In view of the foregoing detailed description of preferred embodiments of the present invention, it readily will be understood by those persons skilled in the art that the present invention is susceptible of broad utility and application. While various aspects have been described in the context of backup, database, and data analysis uses, the aspects may be useful in other contexts as well. Many embodiments and adaptations of the present invention other than those herein described, as well as many variations, modifications, and equivalent arrangements, will be apparent from or reasonably suggested by the present invention and the foregoing description thereof, without departing from the substance or scope of the present invention. Furthermore, any sequence(s) and/or temporal order of steps of various processes described and claimed herein are those considered to be the best mode contemplated for carrying out the present invention. It should also be understood that, although steps of various processes may be shown and described as being in a preferred sequence or temporal order, the steps of any such processes are not limited to being carried out in any particular sequence or order, absent a specific indication of such to achieve a particular intended result. In most cases, the steps of such processes may be carried out in various different sequences and orders, while still falling within the scope of the present inventions. In addition, some steps may be carried out simultaneously. Accordingly, while the present invention has been described herein in detail in relation to preferred embodiments, it is to be understood that this disclosure is only illustrative and exemplary of the present invention and is made merely for purposes of providing a full and enabling disclosure of the invention. The foregoing disclosure is not intended nor is to be construed to limit the present invention or otherwise to exclude any such other embodiments, adaptations, variations, modifications and equivalent arrangements, the present invention being limited only by the claims appended hereto, or presented in any continuing application, and the equivalents thereof.

Thus, for example, it is contemplated within the scope of the present invention that the finite data storage for preserving snapshot data, while having a fixed allocation in preferred embodiments of the present invention, nevertheless may have a dynamic capacity that “grows” as needed as disclosed, for example, in U.S. Pat. No. 6,473,775, issued Oct. 29, 2002, which is incorporated herein by reference.

1. An invention comprising a method of managing finite data storage of a temporal data store comprising one or more data groups, each data group comprising a plurality of members, data of each of which is preserved in the finite data storage, each data group having associated therewith a time point and each member of each data group having associated therewith a preservation weight, the method comprising the step of, upon detecting that consumption of the finite data storage has reached a first level, then, in order of increasing preservation weight beginning with the one or more members having the lowest preservation weight, successively deleting each member in increasing chronological order beginning with the oldest member first, until the finite data storage consumption has reached a second level.

2. A computer-readable medium having computer-readable instructions for performing the method of claim 1.

3. A computer configuration comprising computer-readable medium having computer-readable instructions for performing the method of claim 1.

4. An invention comprising a method of managing finite data storage used to store data of snapshots, each snapshot having associated therewith a snapshot time and a preservation weight, the method comprising the step of, upon detecting that consumption of the finite data storage has reached a first level, then successively deleting snapshots as a function of the preservation weights and snapshot times until the finite data storage consumption has reached a second level.

5. The invention of claim 4, further comprising the step of managing a collection of snapshots of the same object, each snapshot being taken at a different point in time and having data preserved in a finite data storage, by deleting the oldest snapshot of the collection upon the addition of a new snapshot to the collection when the number of snapshots in the collection exceeds a predetermined maximum number.

6. A computer-readable medium having computer-readable instructions for performing the method of claim 4.

7. A computer configuration comprising computer-readable medium having computer-readable instructions for performing the method of claim 4.

8. A method in which data for multiple snapshots is maintained without redundancy of preserved data for different snapshots in data storage, comprising: determining whether a granule of a volume requires caching prior to being overwritten; and a step for saving the granule of the volume prior to being overwritten if it needs caching.